Feature Extraction and Selection in Speech Emotion Recognition

Authors

  • Yixiong Pan
  • Peipei Shen
  • Liping Shen
Abstract

Speech Emotion Recognition (SER) is an active research topic in the field of Human Computer Interaction (HCI). In this paper, we recognize three emotional states: happy, sad and neutral. The explored features include energy, pitch, linear prediction cepstral coefficients (LPCC), Mel-frequency cepstral coefficients (MFCC), and Mel-energy spectrum dynamic coefficients (MEDC). A German corpus (Berlin Database of Emotional Speech) and our self-built Chinese emotional database are used for training the Support Vector Machine (SVM) classifier. Finally, results for different combinations of the features and on different databases are compared and explained. The overall experimental results reveal that the feature combination MFCC + MEDC + Energy has the highest accuracy rate on both the Chinese emotional database (91.3%) and the Berlin emotional database (95.1%).

Keywords: Speech Emotion; Automatic Emotion Recognition; SVM; Energy; Pitch; LPCC; MFCC; MEDC
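To make two of the listed features concrete, the following is a minimal sketch of how per-frame energy and pitch could be computed and summarized into an utterance-level feature vector. It is not the authors' implementation: the function names, the 40 ms frame and 20 ms hop, the 60 to 400 Hz pitch search range, and the autocorrelation pitch estimator are all assumptions chosen for illustration. The spectral features (LPCC, MFCC, MEDC) and the SVM classifier are left out.

```python
import math
import statistics


def split_frames(signal, frame_len, hop):
    """Split samples into overlapping frames; a short tail is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def pitch_autocorr(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate F0 in Hz as the lag of the largest autocorrelation value
    within the assumed [fmin, fmax] range; returns 0.0 if no positive peak."""
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        val = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag if best_lag else 0.0


def utterance_features(signal, sample_rate, frame_ms=40, hop_ms=20):
    """Summarize frame-level energy and pitch as means and standard deviations.

    Frame and hop sizes are illustrative assumptions, not values from the paper.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    frames = split_frames(signal, frame_len, hop)
    if not frames:
        raise ValueError("signal is shorter than one frame")
    energies = [short_time_energy(f) for f in frames]
    # Keep only frames where a pitch peak was found.
    pitches = [p for p in (pitch_autocorr(f, sample_rate) for f in frames) if p > 0]
    return {
        "energy_mean": statistics.mean(energies),
        "energy_std": statistics.pstdev(energies),
        "pitch_mean": statistics.mean(pitches) if pitches else 0.0,
        "pitch_std": statistics.pstdev(pitches) if pitches else 0.0,
    }
```

Calling `utterance_features(samples, 16000)` on a list of float samples returns a dictionary of four statistics. In a pipeline like the one the abstract describes, such values would presumably be combined with the spectral features before training a classifier.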


Similar resources

Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms

One of the important issues in speech emotion recognition is selecting appropriate feature sets in order to improve the detection rate and classification accuracy. In previous studies, researchers tried to select appropriate features for classification by using feature selection and feature-space reduction methods such as Fisher and PCA. In this research, a hybrid evolutionary algorit...


A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...


A Real-Time Electroencephalography Classification in Emotion Assessment Based on Synthetic Statistical-Frequency Feature Extraction and Feature Selection

Purpose: To assess three main emotions (happy, sad and calm) by various classifiers, using appropriate feature extraction and feature selection. Materials and Methods: In this study a combination of Power Spectral Density and a series of statistical features are proposed as statistical-frequency features. Next, a feature selection method from pattern recognition (PR) Tools is presented to e...


Classification of emotional speech using spectral pattern features

Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...


A Comparative Study of Gender and Age Classification in Speech Signals

Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...


Speech Emotion Recognition Using Radon and Discrete Cosine Transform-Based Features from Speech Spectrogram

Speech Emotion Recognition (SER) is a multi-disciplinary research area that has received increased attention over the last years. The aim of a SER system is to recognize human emotion by analyzing acoustics of speech sound to improve the voice-based human-machine interactions. This study presents a new feature extraction technique for SER using Radon transform and discrete cosine transform (DCT...


Publication date: 2012